{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "GD9gUQpaBxNa"
   },
   "source": [
    "# Train YOLOv7 on a Custom Dataset\n",
    "\n",
    "[YOLOv7](https://blog.roboflow.com/yolov7-breakdown/) is a state-of-the-art realtime [object detection](https://blog.roboflow.com/object-detection/) model. In this notebook, we'll train YOLOv7 on a custom dataset prepared in [Roboflow](https://roboflow.com/?ref=studiolab) using [AWS Studio Lab](https://studiolab.sagemaker.aws/). At the end, we'll have a model that can find our objects of interest in images and videos.\n",
    "\n",
    "Many thanks to WongKinYiu and AlexeyAB for writing [the YOLOv7 paper](https://arxiv.org/abs/2207.02696) and releasing [the code](https://github.com/WongKinYiu/yolov7).\n",
    "\n",
    "## **Steps Covered in this Tutorial**\n",
    "\n",
    "To train our detector, we will take the following steps:\n",
    "\n",
    "* Install YOLOv7 dependencies\n",
    "* Create or choose an open source dataset\n",
    "* Load dataset in YOLOv7 format\n",
    "* Run YOLOv7 training\n",
    "* Evaluate YOLOv7 performance\n",
    "* Run YOLOv7 inference on test images\n",
    "* Download Weights for Deployment (Optional)\n",
    "* Implement Active Learning to Improve our Model (Optional)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7mGmQbAO5pQb"
   },
   "source": [
    "# Install YOLOv7 Dependencies\n",
    "\n",
    "## Connecting a GPU\n",
    "\n",
    "Training will go much faster if you use a GPU Runtime.\n",
    "\n",
    "From the Studio Lab homepage, select `GPU` as your `Compute type`.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/LHpjx5x.png\" style=\"max-width: 450px;\"></div>\n",
    "\n",
    "To verify that you're running with a GPU, run the following command and ensure that it outputs your GPU stats (if you're running in a CPU runtime, this will give an error message instead):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Wed Nov 30 17:57:40 2022       \n",
      "+-----------------------------------------------------------------------------+\n",
      "| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |\n",
      "|-------------------------------+----------------------+----------------------+\n",
      "| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n",
      "| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n",
      "|                               |                      |               MIG M. |\n",
      "|===============================+======================+======================|\n",
      "|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |\n",
      "| N/A   21C    P0    25W /  70W |      0MiB / 15109MiB |      0%      Default |\n",
      "|                               |                      |                  N/A |\n",
      "+-------------------------------+----------------------+----------------------+\n",
      "                                                                               \n",
      "+-----------------------------------------------------------------------------+\n",
      "| Processes:                                                                  |\n",
      "|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n",
      "|        ID   ID                                                   Usage      |\n",
      "|=============================================================================|\n",
      "|  No running processes found                                                 |\n",
      "+-----------------------------------------------------------------------------+\n"
     ]
    }
   ],
   "source": [
    "!nvidia-smi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If there are no GPUs available, you will still be able to run this notebook, but training time will be greatly increased.\n",
    "\n",
    "## Cloning the YOLOv7 Repo\n",
    "\n",
    "Next, we'll pull down [the YOLOv7 repo](https://github.com/WongKinYiu/yolov7) from Github and install its dependencies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "nD-uPyQ_2jiN"
   },
   "outputs": [],
   "source": [
    "# Save the working directory path for later use\n",
    "import os\n",
    "HOME = os.getcwd()\n",
    "\n",
    "# Download YOLOv7 repository and install requirements\n",
    "!git clone https://github.com/WongKinYiu/yolov7\n",
    "%cd yolov7\n",
    "!pip install -r requirements.txt --ignore-installed\n",
    "!pip install torch==1.12.1 torchvision==0.13.1 --ignore-installed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installing Roboflow\n",
    "\n",
    "We'll be using [Roboflow](https://roboflow.com/?ref=studiolab) to prepare and host our custom object detection dataset (and, optionally, to intelligently sample more images during inference to improve our dataset).\n",
    "\n",
    "The [`roboflow` pip package](https://blog.roboflow.com/pip-install-roboflow/) will load our dataset in the correct format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install roboflow --ignore-installed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Preparing a Custom Dataset\n",
    "\n",
    "In order to train YOLOv7, we'll need a dataset which is composed of three parts:\n",
    "\n",
    "1. Images - ideally very similar to the ones our model will make predictions on.\n",
    "2. Annotations - special TXT files describing bounding boxes that our model will use to learn what it's looking for.\n",
    "3. A YAML File - contains configuration and metadata needed by YOLOv7 to understand our images and annotations.\n",
    "\n",
    "Our model will learn to fit the data in the training set, and evaluate its results against the validation set. At the end of the tutorial, we will try our model on the held-out test set to preview how it might perform in the wild when making predictions on images it has never seen before.\n",
    "\n",
    "We have two options for sourcing our dataset. We can create one from scratch (using our own images, and labeling it with our own annotations), or choose one that's been prepared and open sourced by someone else.\n",
    "\n",
    "## Option 1: Create a YOLOv7 Dataset with Your Own Images\n",
    "\n",
    "If you have our own images (and, optionally, annotations), click the three dots to expand the instructions for preparing your own dataset:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "_**Note:** If you're running on Sagemaker Studio Lab you will probably want to stop your runtime while you create your dataset so you don't waste your quota. Depending on the size and complexity of your dataset, these steps may take a while to complete._\n",
    "\n",
    "### Step 1: [Sign up for a free Roboflow Account](https://app.roboflow.com/?ref=smsl-yolov7)\n",
    "\n",
    "[Roboflow](https://roboflow.com/?ref=studiolab) is an end-to-end computer vision platform. It helps you [create](https://docs.roboflow.com/quick-start?ref=studiolab), [understand](https://blog.roboflow.com/dataset-search/?ref=studiolab), and [use](https://docs.roboflow.com/exporting-data?ref=studiolab) image datasets to train and deploy custom models.\n",
    "\n",
    "Roboflow strives to be broadly interoperable and can import and export object detection datasets in [dozens of formats](https://roboflow.com/formats?ref=studiolab). They maintain [training notebooks](https://models.roboflow.com/?ref=studiolab) (like this one) for many state of the art computer vision models, and also offer [AutoML training](https://docs.roboflow.com/train?ref=studiolab) which can be useful for [prototyping](https://blog.roboflow.com/deploy-tab/?ref=studiolab), [model assisted labeling](https://roboflow.com/annotate?ref=studiolab), and even [deploying to a wide range of targets and edge devices](https://roboflow.com/deploy?ref=studiolab).\n",
    "\n",
    "In this tutorial, we will use Roboflow to annotate a custom dataset and export it for use with YOLOv7 in this notebook. But we encourage you to explore [its other features](https://roboflow.com/features?ref=studiolab) as well.\n",
    "\n",
    "### Step 2: Create a Public Workspace\n",
    "\n",
    "Roboflow offers a [generous free tier](https://roboflow.com/pricing?ref=studiolab) if your data can be shared publicly with others on [Roboflow Universe](https://universe.roboflow.com/?ref=studiolab). There are also paid plans available for private data.\n",
    "\n",
    "For this tutorial you'll need to create a Public workspace. Be sure to give it a good name; it will serve as your Universe username where you can showcase your work.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/zfE5MZL.png\" style=\"max-width: 600px;\"></div>\n",
    "    \n",
    "### Step 3: Create a Project\n",
    "\n",
    "Then, create an `Object Detection` project (be sure to give it a descriptive name, and fill in the `What will your model predict?` section since they will make your project more understandable and be pulled in via the API later, and can serve as helpful metadata later for advanced use-cases like automated prompt engineering for zero-shot models).\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/O2xDyxQ.png\" style=\"max-width: 500px;\"></div>\n",
    "\n",
    "### Step 4: Upload your Images\n",
    "\n",
    "Next, drop image (or video) files into the UI and, optionally, drop existing annotations in [any supported format](https://roboflow.com/formats?ref=studiolab). Alternatively you can use the [Upload API](https://docs.roboflow.com/adding-data/upload-api?ref=studiolab) or [load images from an S3 bucket](https://blog.roboflow.com/how-to-use-s3-computer-vision-pipeline/?ref=studiolab).\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/hWrhtNj.png\" style=\"max-width: 600px;\"></div>\n",
    "\n",
    "Then click `Finish Uploading` to add the images to your Roboflow project.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/9hUh9j5.png\" style=\"max-width: 220px;\"></div>\n",
    "\n",
    "_**Note:** To get good, generalizable results you will need lots of images covering a wide variety of situations and edge cases. Exactly how many images you need [depends on a wide variety of factors](https://blog.roboflow.com/images-train-model/?ref=studiolab), but we recommend starting out with at least 200 for most use-cases. If you need more images, try sourcing from open source datasets on [Roboflow Universe](https://universe.roboflow.com/?ref=studiolab) with images similar to yours._\n",
    "\n",
    "### Step 5: Annotate\n",
    "\n",
    "_**Note:** If you imported annotations from another labeling tool or open source dataset, you can skip this step._\n",
    "\n",
    "Now, we'll use [Roboflow Annotate](https://roboflow.com/annotate?ref=studiolab) to create annotations that will teach our model what we're trying to detect in our images. Since your model will learn to mimic your annotations, it's important that you give some thought to how you label your images ahead of time. We've compiled a list of [best practices to consider when labeling images](https://blog.roboflow.com/tips-for-how-to-label-images/?ref=studiolab).\n",
    "\n",
    "You can either choose to annotate the images yourself, or invite a friend to your workspace to help out for double the fun.\n",
    "\n",
    "<video loop autoplay controls src=\"https://i.imgur.com/AuDAPs4.mp4\">Annotate Images</video>\n",
    "\n",
    "_**PROTIP for YOLOv7:** if your objects of interest don't fit nicely into bounding boxes, consider [annotating with polygons](https://blog.roboflow.com/polygons-object-detection/?ref=studiolab). The modern data processing pipeline in YOLOv7 can benefit from the additional information polygons provide to achieve more accurate training results._\n",
    "\n",
    "### Step 6: Add to Dataset\n",
    "\n",
    "Once you have annotated 200 images or more, you're ready to add them to your dataset. At this point you'll choose how to split your images into train, valid, and test sets. We usually recommend keeping the 70/20/10 ratio unless you have more than a few thousand images, in which case you might want to add a higher percentage to your test set.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/XUVE6uq.png\" style=\"max-width: 400px;\"></div>\n",
    "\n",
    "If you're an advanced user, you can also choose to craft these sets by hand to verify that your model is going to generalize well.\n",
    "\n",
    "### Step 7: View the Health Check (Optional)\n",
    "\n",
    "Roboflow includes a [dataset health check](https://docs.roboflow.com/dataset-health-check?ref=studiolab) which gives information about your class balance, box distribution, and image sizes. Checking this can help you choose good preprocessing and augmentation steps while generating a version of your dataset which can improve your model's performance.\n",
    "\n",
    "For example, if you discover your objects are all clustered around the center of the image, you may want to [apply a static crop](https://docs.roboflow.com/image-transformations/image-preprocessing#static-crop) so your model can focus only on the important parts of the image. Or if your images have giant dimensions compared to your objects of interes, [you might want to apply tiling](https://blog.roboflow.com/detect-small-objects/?ref=studiolab).\n",
    "\n",
    "### Step 8: Generate a Version\n",
    "\n",
    "Roboflow helps you version control your datasets so that you can get repeatable results & track changes and performance over time. It also lets you preprocess and augment your images which can speed up your training time and improve your model's results.\n",
    "\n",
    "Choose your preprocessing steps. For YOLOv7, we recommend [Auto-Orient](https://blog.roboflow.com/exif-auto-orientation/?ref=studiolab) and Resize (Stretch to 640x640), which is YOLOv7's default input size. You may also want to [enable Tiling](https://blog.roboflow.com/edge-tiling-during-inference/?ref=studiolab) if your objects are very small compared to the size of your images (if you're not sure, try training a model first without, and if you get poor results you can try again).\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/SZojEwP.png\" style=\"max-width: 400px;\"></div>\n",
    "\n",
    "Next, choose your desired augmentations. YOLOv7 does online augmentations during training automatically, so if you have more than 500 annotated images in your dataset you can usually skip this step. But for smaller datasets we've seen good results adding some basic augmentations which mimic things your model may see in the while. We recommend **not** adding Cutout or Mosaic for YOLOv7 as these are already applied during training and applying them twice produces poor results.\n",
    "\n",
    "Finally, click `Generate` to lock in your choices and render your images.\n",
    "\n",
    "### Step 9: Export for YOLOv7\n",
    "\n",
    "Once you've generated a dataset version, click `Export` and select the `YOLO v7 PyTorch` format and the `show download code` option.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/DXcJpUE.png\" style=\"max-width: 500px;\"></div>\n",
    "\n",
    "This will convert your dataset to the proper format and make it available for use in this notebook.\n",
    "\n",
    "### Step 9: Copy Your Snippet\n",
    "\n",
    "Now you're all set, simply copy and paste the download snippet from the `Jupyter` tab into the code cell below and you're ready to train your custom YOLOv7 model.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/hZJYgdy.png\" style=\"max-width: 500px;\"></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Option 2: Use an Open Source Dataset\n",
    "\n",
    "[Roboflow Universe](https://universe.roboflow.com/?ref=studiolab) is the world's largest repository of open source datasets. There are [over 100,000 datasets to choose from](https://blog.roboflow.com/computer-vision-datasets-and-apis/?ref=studiolab) for [a plethora of use-cases](https://universe.roboflow.com/browse?ref=studiolab).\n",
    "\n",
    "If you'd like to start from an open source dataset, click the three dots to expand the instructions:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Step 1: Find a Dataset\n",
    "\n",
    "With over 100,000 open source datasets (and over 10,000 of those with a pre-trained model you can try in your browser), it's a near-certainty that you can find a starting point for nearly any computer vision problem on Roboflow Universe.\n",
    "\n",
    "If you're looking for inspiration, browse [some of the most recently updated projects](https://universe.roboflow.com/search?q=object%20detection%20images%3E%3D100%20images%3C%3D1000%20has%3Amodel&ref=studiolab) or check out some of our [curated collections of datasets](https://universe.roboflow.com/browse?ref=studiolab).\n",
    "\n",
    "### Step 2: Explore its Contents (Optional)\n",
    "\n",
    "Once you've found something that looks interesting, you can [explore the images using Dataset Search](https://blog.roboflow.com/dataset-search/?ref=studiolab) to make sure its images cover all the cases you're looking for.\n",
    "\n",
    "If nothing is quite right, you may want to combine specific images from multiple datasets into a new custom project instead. For example, [the Microsoft COCO Dataset](https://blog.roboflow.com/coco-dataset/?ref=studiolab) doesn't have a class for graffiti, but [it has many images containing graffiti](https://universe.roboflow.com/jacob-solawetz/microsoft-coco/browse?queryText=graffiti&pageSize=50&startingIndex=0&browseQuery=true&ref=studiolab) that you could use to bootstrap a new dataset along with some [other open source graffiti datasets](https://universe.roboflow.com/search?q=graffiti%20object%20detection%20images%3E100&ref=studiolab).\n",
    "\n",
    "### Step 3: Try a Pre-Trained Model in Your Browser (Optional)\n",
    "\n",
    "Since many other users of Roboflow Universe have already trained a model on their datasets, you can oftentimes get a sneak preview of how your YOLOv7 model might perform by trying it out in your webcam or on some sample images using [the project's Model tab](https://blog.roboflow.com/deploy-tab/?ref=studiolab).\n",
    "\n",
    "<video loop autoplay controls src=\"images/deploy-tab.mp4\">Try a Pre-Trained Model in Your Browser</video>\n",
    "\n",
    "Even though these pre-trained models don't use YOLOv7, oftentimes the limiting factor of prediction quality is in the data, not the model. So they can give you some preliminary intelligence on how well your YOLOv7 model might perform if you train it on that dataset.\n",
    "\n",
    "### Step 4: Choose a Dataset Version\n",
    "\n",
    "Many datasets on Roboflow Universe have multiple versions that were generated with different settings (eg different image sizes, and augmentation steps) and, potentially, with different sets of images as the maintainer [added more through active learning](https://blog.roboflow.com/computer-vision-active-learning-tips/?ref=studiolab) over time. Browse through the options available and choose one that looks good to you. For YOLOv7 we recommend an image size of 640x640 or higher for best results.\n",
    "\n",
    "_**PROTIP:** If multiple versions of the dataset have trained models on Roboflow Universe, you may want to choose the one that achieved the highest mean average precision as it can be a signal of higher._\n",
    "\n",
    "### Step 5: Get Your Download Snippet\n",
    "\n",
    "When you've found a dataset version that suits your needs, click `Download` and select the `YOLO v7 PyTorch` format. This will give you a code snippet that will load your dataset into this notebook when you paste it into the code cell below.\n",
    "\n",
    "<div><img src=\"https://i.imgur.com/hZJYgdy.png\" style=\"max-width: 500px;\"></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "mtJ24mPlyF-S",
    "tags": []
   },
   "source": [
    "# Load Dataset in YOLOv7 Format\n",
    "\n",
    "Now that we have a dataset, we'll download it into our notebook environment in the right format. Use the `YOLO v7 PyTorch` export option in Roboflow.\n",
    "\n",
    "The YOLOv7 requires YOLO TXT annotations, a custom YAML file, and organized directories. Roboflow creates and hosts this for us and helps download and configure it for our model to consume.\n",
    "\n",
    "**Copy and paste your code snippet from Roboflow into the code cell below.** The snippet contains a reference to the dataset and an API Key that will authorize you to access your private data from Roboflow (if applicable) and perform (optional) advanced options later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ovKgrVN8ygdW"
   },
   "outputs": [],
   "source": [
    "# REPLACE this cell with your custom code snippet generated in the steps above\n",
    "\n",
    "\n",
    "from roboflow import Roboflow\n",
    "rf = Roboflow(api_key=\"dkfYX3lI855vdjisgMWb\")\n",
    "project = rf.workspace(\"markmcquade\").project(\"boxpunch-detector\")\n",
    "dataset = project.version(2).download(\"yolov7\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bHfT9gEiBsBd"
   },
   "source": [
    "# Run YOLOv7 Custom Training\n",
    "\n",
    "We're ready to start custom training.\n",
    "\n",
    "NOTE: We will only modify one of the YOLOv7 training defaults in our example: `epochs`. We will adjust from 300 to 100 epochs in our example for speed. If you'd like to change other settings, see details in our [how to train YOLOv7 blog post](https://blog.roboflow.com/yolov7-custom-dataset-training-tutorial/?ref=studiolab)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "bUbmy674bhpD"
   },
   "outputs": [],
   "source": [
    "# download COCO starting checkpoint\n",
    "%cd {HOME}/yolov7\n",
    "!wget \"https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1iqOPKjr22mL"
   },
   "outputs": [],
   "source": [
    "# run this cell to begin training\n",
    "%cd {HOME}/yolov7\n",
    "!python train.py --batch 8 --cfg cfg/training/yolov7.yaml --epochs 100 --data {dataset.location}/data.yaml --weights 'yolov7.pt' --device 0 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0W0MpUaTCJro"
   },
   "source": [
    "# Evaluation\n",
    "\n",
    "We can evaluate the performance of our custom training using the provided evalution script.\n",
    "\n",
    "Note we can adjust the below custom arguments. For details, see [the arguments accepted by detect.py](https://github.com/WongKinYiu/yolov7/blob/main/detect.py#L154)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "N4cfnLtTCIce"
   },
   "outputs": [],
   "source": [
    "# Run evaluation\n",
    "%cd {HOME}/yolov7\n",
    "\n",
    "# Get the directory containing the most recent training run\n",
    "EXP_DIR=sorted(os.listdir(\"runs/train\"), key=lambda x: int(x.replace(\"exp\", \"\") or 0))[-1]\n",
    "\n",
    "!python detect.py --weights runs/train/{EXP_DIR}/weights/best.pt --conf 0.1 --source {dataset.location}/test/images"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 964
    },
    "executionInfo": {
     "elapsed": 417,
     "status": "ok",
     "timestamp": 1657651105912,
     "user": {
      "displayName": "Brad Dwyer",
      "userId": "09518147131768799192"
     },
     "user_tz": 300
    },
    "id": "6AGhNOSSHY4_",
    "outputId": "b0e7593f-5c5b-4807-82ab-57ffc65a8ca2"
   },
   "outputs": [],
   "source": [
    "#display inference on ALL test images\n",
    "\n",
    "import glob\n",
    "from IPython.display import Image, display\n",
    "\n",
    "# Get the directory containing the most recent detection run\n",
    "EXP_DIR=sorted(os.listdir(\"runs/detect\"), key=lambda x: int(x.replace(\"exp\", \"\") or 0))[-1]\n",
    "\n",
    "i = 0\n",
    "limit = 10000 # max images to print\n",
    "for imageName in glob.glob(f\"{HOME}/yolov7/runs/detect/{EXP_DIR}/*.jpg\"): #assuming JPG\n",
    "    if i < limit:\n",
    "      display(Image(filename=imageName))\n",
    "      print(\"\\n\")\n",
    "    i = i + 1\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4jn4kCtgKiGO"
   },
   "source": [
    "# OPTIONAL: Deployment\n",
    "\n",
    "To deploy, you'll need to export your weights and save them to use later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "wWOok8abrCsL"
   },
   "outputs": [],
   "source": [
    "# optional, zip to download weights and results locally\n",
    "\n",
    "!zip -r export.zip runs/detect\n",
    "!zip -r export.zip runs/train/exp/weights/best.pt\n",
    "!zip export.zip runs/train/exp/*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can now download the `export.zip` file to your computer using the file browser to the left; right click the file and select \"Download\"."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "f41PvE5gKhYw"
   },
   "source": [
    "# OPTIONAL: Active Learning Example\n",
    "\n",
    "Once our first training run is complete, we should use our model to help identify which images are most problematic in order to investigate, annotate, and improve our dataset (and, therefore, model).\n",
    "\n",
    "To do that, we can execute code that automatically uploads images back to our hosted dataset if the image is a specific class or below a given confidence threshold.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "mcINqQS7Kt3-"
   },
   "outputs": [],
   "source": [
    "# # setup access to your workspace\n",
    "# rf = Roboflow(api_key=\"YOUR_API_KEY\")                               # used above to load data\n",
    "# inference_project =  rf.workspace().project(\"YOUR_PROJECT_NAME\")    # used above to load data; change to your own project if you used a dataset from Roboflow Universe\n",
    "# model = inference_project.version(1).model\n",
    "\n",
    "# upload_project = rf.workspace().project(\"YOUR_PROJECT_NAME\")\n",
    "\n",
    "# print(\"inference reference point: \", inference_project)\n",
    "# print(\"upload destination: \", upload_project)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "cEl1NVE3LSD_"
   },
   "outputs": [],
   "source": [
    "# # example upload: if prediction is below a given confidence threshold, upload it \n",
    "\n",
    "# confidence_interval = [10,70]                                   # [lower_bound_percent, upper_bound_percent]\n",
    "\n",
    "# for prediction in predictions:                                  # predictions list to loop through\n",
    "#   if(prediction['confidence'] * 100 >= confidence_interval[0] and \n",
    "#           prediction['confidence'] * 100 <= confidence_interval[1]):\n",
    "        \n",
    "#           # upload on success!\n",
    "#           print(' >> image uploaded!')\n",
    "#           upload_project.upload(image, num_retry_uploads=3)     # upload image in question"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LVpCFeU-K4gb"
   },
   "source": [
    "# Next steps\n",
    "\n",
    "Congratulations, you've trained a custom YOLOv7 model! Next, start thinking about deploying and [building an MLOps pipeline](https://docs.roboflow.com/?ref=studiolab) so your model gets better the more data it sees in the wild."
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "Training YOLOv7 on Custom Data",
   "provenance": [
    {
     "file_id": "1YnbqOinBZV-c9I7fk_UL6acgnnmkXDMM",
     "timestamp": 1657587444672
    },
    {
     "file_id": "1gDZ2xcTOgR39tGGs-EZ6i3RTs16wmzZQ",
     "timestamp": 1656523193068
    },
    {
     "file_id": "https://github.com/ultralytics/yolov5/blob/master/tutorial.ipynb",
     "timestamp": 1591755516488
    }
   ],
   "toc_visible": true
  },
  "instance_type": "ml.g4dn.xlarge",
  "kernelspec": {
   "display_name": "default:Python",
   "language": "python",
   "name": "conda-env-default-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
